THE AMENDMENTS

In the Specification 

Amend the paragraph starting at page 9, line 24 to: 

The terms "identical" or percent "identity," in the context of two or more nucleic acids or 
polypeptide sequences, refer to two or more sequences or subsequences that are the same or have 
a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% 
identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 
99%, or higher identity over a specified region, when compared and aligned for maximum 
correspondence over a comparison window or designated region) as measured using the BLAST or
BLAST 2.0 sequence comparison algorithms with default parameters described below, or by
manual alignment and visual inspection (see, e.g., NCBI web site
http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to be "substantially
identical." This definition also refers to, or may be applied to, the complement of a test
sequence. The definition also includes sequences that have deletions and/or additions, as well as 
those that have substitutions, as well as naturally occurring, e.g., polymorphic or allelic variants, 
and man-made variants. As described below, the preferred algorithms can account for gaps and 
the like. Preferably, identity exists over a region that is at least about 25 amino acids or 
nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides 
in length. 
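
The percent identity calculation described above can be sketched as follows. This is a minimal illustration only, assuming the two sequences have already been aligned to equal length with "-" marking gaps. The function name `percent_identity` and the treatment of gap columns (columns gapped in both sequences are ignored; columns gapped in one sequence count as mismatches) are assumptions made for the example and are not taken from the specification or from BLAST.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes pre-aligned input of equal length, with '-' marking gaps.
    Columns gapped in both sequences are skipped; columns gapped in only one
    sequence count as mismatches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where at least one sequence has a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
    print(percent_identity("AC-T", "ACGT"))
```

Under these assumptions, a pair of sequences would meet an "about 60% identity" criterion when the returned value over the designated region is about 60 or higher.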

Amend the paragraph starting at page 10, line 27 to: 

Preferred examples of algorithms that are suitable for determining percent sequence 
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are 
described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et al., J. Mol.
Biol. 215:403-410 (1990). BLAST and BLAST 2.0 are used, with the parameters described
herein, to determine percent sequence identity for the nucleic acids and proteins of the invention. 
Software for performing BLAST analyses is publicly available through the National Center for 
Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first
identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the 
query sequence, which either match or satisfy some positive-valued threshold score T when 
aligned with a word of the same length in a database sequence. T is referred to as the 
neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word
hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are
extended in both directions along each sequence for as far as the cumulative alignment score can 
be increased. Cumulative scores are calculated using, e.g., for nucleotide sequences, the 
parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for 
mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to 
calculate the cumulative score. Extension of the word hits in each direction is halted when: the
cumulative alignment score falls off by the quantity X from its maximum achieved value; the 
cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring 
residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters 
W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for 
nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5,
N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses
as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see
Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of 50,
expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
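
The extension step described above can be sketched for nucleotide sequences as follows. This is a simplified, ungapped illustration and not the BLAST implementation: it extends a word hit in one direction only (the same logic would be applied to the left), uses the reward M=5 and penalty N=-4 stated above, and assumes a drop-off value X=20, which is chosen only for the example because the paragraph gives no value for X. The function name `extend_right` and its arguments are hypothetical.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 word_len: int, M: int = 5, N: int = -4, X: int = 20):
    """Extend a word hit to the right without gaps.

    Returns (best_score, best_length), where best_length is measured from the
    start of the seed word. X=20 is an assumed drop-off value.
    """
    # Score the seed word itself.
    score = 0
    for k in range(word_len):
        score += M if query[q_start + k] == subject[s_start + k] else N
    best_score, best_len = score, word_len

    i, j = q_start + word_len, s_start + word_len
    length = word_len
    # Halt when the end of either sequence is reached...
    while i < len(query) and j < len(subject):
        score += M if query[i] == subject[j] else N
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        # ...or when the score falls off by X from its maximum,
        # or goes to zero or below.
        if best_score - score >= X or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len


if __name__ == "__main__":
    print(extend_right("ACGTACGTTT", "ACGTACGTAA", 0, 0, word_len=4))
```

In this example the four-base seed scores 20, four further matches raise the score to 40, and the two trailing mismatches lower it without improving the maximum, so the reported segment covers the first eight bases.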

Amend the paragraph starting at page 25, line 23: 

In a preferred embodiment, metastatic colorectal cancer sequences are those that are
up-regulated in metastatic colorectal cancer; that is, the expression of these genes is higher in the
metastatic tissue as compared to non-metastatic cancerous tissue or normal colon tissue (see, 
e.g., Tables 1-26). "Up-regulation" as used herein means, when the ratio is presented as a 
number greater than one, that the ratio is greater than one, preferably 1.5 or greater, more
preferably 2.0 or greater. All UniGene cluster identification numbers and accession numbers 
herein are for the GenBank sequence database and the sequences of the accession numbers are 
hereby expressly incorporated by reference. GenBank is known in the art, see, e.g., Benson, DA,
et al., Nucleic Acids Research 26:1-7 (1998) and http://www.ncbi.nlm.nih.gov/. Sequences are
also available in other databases, e.g., European Molecular Biology Laboratory (EMBL) and 
DNA Database of Japan (DDBJ). 
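
The ratio thresholds stated above can be expressed as a short sketch. The function name `upregulation_tier` and the returned labels are illustrative assumptions; only the cut-offs (greater than one, 1.5 or greater, 2.0 or greater) come from the paragraph.

```python
def upregulation_tier(ratio: float) -> str:
    """Classify an expression ratio (metastatic / comparison tissue).

    Labels are hypothetical; the thresholds follow the paragraph above.
    """
    if ratio >= 2.0:
        return "more preferred"
    if ratio >= 1.5:
        return "preferred"
    if ratio > 1.0:
        return "up-regulated"
    return "not up-regulated"


if __name__ == "__main__":
    for r in (0.8, 1.2, 1.5, 3.0):
        print(r, upregulation_tier(r))
```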

Amend the paragraph starting at page 31, line 27:

Transmembrane proteins may contain from one to many transmembrane domains. For
example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases and 
receptor serine/threonine protein kinases contain a single transmembrane domain. However, 
various other proteins including channels, pumps, and adenylyl cyclases contain numerous 
transmembrane domains. Many important cell surface receptors such as G protein coupled 
receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they contain 7 
membrane spanning regions. Characteristics of transmembrane domains include approximately 
20 consecutive hydrophobic amino acids that may be followed by charged amino acids. 
Therefore, upon analysis of the amino acid sequence of a particular protein, the localization and 
number of transmembrane domains within the protein may be predicted (see, e.g., PSORT web
site http://psort.nibb.ac.jp/).
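
The sequence characteristic described above (approximately 20 consecutive hydrophobic amino acids) can be illustrated with a simple scan. This is a rough sketch only and is not the method used by PSORT: the residue set in `HYDROPHOBIC`, the strict requirement of an uninterrupted run, and the function name `hydrophobic_runs` are all assumptions made for the example.

```python
# Assumed set of hydrophobic residues (one-letter codes); illustrative only.
HYDROPHOBIC = set("AVLIMFWC")


def hydrophobic_runs(protein: str, min_len: int = 20):
    """Return (start, end) pairs, end exclusive, for each uninterrupted run of
    at least min_len hydrophobic residues in the protein sequence."""
    runs = []
    start = None
    seq = protein.upper()
    for i, aa in enumerate(seq):
        if aa in HYDROPHOBIC:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Handle a run that reaches the end of the sequence.
    if start is not None and len(seq) - start >= min_len:
        runs.append((start, len(seq)))
    return runs


if __name__ == "__main__":
    print(hydrophobic_runs("KK" + "L" * 21 + "RR"))
```

The number of runs returned gives a crude estimate of the number of candidate transmembrane domains; practical predictors typically score hydrophobicity over a sliding window rather than requiring an unbroken run.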

Amend the paragraph starting at page 33, line 15: 

The metastatic colorectal cancer nucleic acid sequences of the invention, e.g., the 
sequences in Tables 1-26, can be fragments of larger genes, i.e., they are nucleic acid segments. 
"Genes" in this context includes coding regions, non-coding regions, and mixtures of coding and 
non-coding regions. Accordingly, as will be appreciated by those in the art, using the sequences 
provided herein, extended sequences, in either direction, of the metastatic colorectal cancer 
genes can be obtained, using techniques well known in the art for cloning either longer 
sequences or the full length sequences; see Ausubel, et al., supra. Much can be done by
informatics and many sequences can be clustered to include multiple sequences corresponding to 
a single gene, e.g., systems such as UniGene (see http://www.ncbi.nlm.nih.gov/unigene/).

Amend the paragraph starting at page 36, line 21:

Alternatively, the oligonucleotides may be synthesized on the surface, as is known in the art. 
For example, photoactivation techniques utilizing photopolymerization compounds and 
techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in situ, 
using well known photolithographic techniques, such as those described in WO 95/25116; WO
95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited within, all of which 
are expressly incorporated by reference; these methods of attachment form the basis of the 
[[Affimetrix]] Affymetrix [[GeneChip™]] GENECHIP® (DNA microarray chip) technology.

Amend the paragraph starting at page 51, line 4:

"Differential expression," or grammatical equivalents as used herein, refers to both 
qualitative as well as quantitative differences in the genes' temporal and/or cellular expression 
patterns within and among the cells. Thus, a breast cancer gene can qualitatively have its 
expression altered, including an activation or inactivation, in, for example, normal versus breast 
cancer tissue. That is, genes may be turned on or turned off in a particular state, relative to 
another state. As is apparent to the skilled artisan, any comparison of two or more states can be 
made. Such a qualitatively regulated gene will exhibit an expression pattern within a state or cell 
type which is detectable by standard techniques in one such state or cell type, but is not 
detectable in both. Alternatively, the determination is quantitative in that expression is increased 
or decreased; that is, the expression of the gene is either upregulated, resulting in an increased 
amount of transcript, or downregulated, resulting in a decreased amount of transcript. The 
degree to which expression differs need only be large enough to quantify via standard 
characterization techniques as outlined below, such as by use of Affymetrix [[GeneChip™]]
GENECHIP® (DNA microarray chip) expression arrays, Lockhart, Nature Biotechnology,
14:1675-1680 (1996), hereby expressly incorporated by reference. Other techniques include, but 
are not limited to, quantitative reverse transcriptase PCR, Northern analysis and RNase 
protection. As outlined above, preferably the change in expression (i.e., upregulation or
downregulation) is at least about 50%, more preferably at least about 100%, more preferably at 
least about 150%, more preferably, at least about 200%, with from 300 to at least 1000% being 
especially preferred. 
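
The quantitative comparison described above can be sketched as a percent change between two states. This sketch assumes that the stated percentages refer to the change in transcript amount relative to a reference state, which is one reading of the paragraph; the function name `percent_change` and the example values are hypothetical.

```python
def percent_change(test: float, reference: float) -> float:
    """Percent change in expression of the test state relative to the reference.

    Positive values indicate upregulation and negative values downregulation.
    Interpreting the stated thresholds as change relative to the reference is
    an assumption.
    """
    if reference <= 0:
        raise ValueError("reference expression must be positive")
    return 100.0 * (test - reference) / reference


if __name__ == "__main__":
    print(percent_change(300.0, 200.0))
    print(percent_change(100.0, 200.0))
```

Under this reading, a change of at least about 50% corresponds to a test-to-reference ratio of about 1.5 or greater for upregulation.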

Amend the paragraph starting at page 64, line 4: 

Generally, in a preferred embodiment of the methods herein, the breast cancer protein or the 
candidate agent is non-diffusably bound to an insoluble support having isolated sample receiving 
areas (e.g., a microtiter plate, an array, etc.). It is understood that alternatively, soluble assays
known in the art may be performed. The insoluble supports may be made of any composition to
which the compositions can be bound, that is readily separated from soluble material, and that is
otherwise compatible with the overall method of screening. The surface of such supports may be 
solid or porous and of any convenient shape. Examples of suitable insoluble supports include 
microtiter plates, arrays, membranes and beads. These are typically made of glass, plastic (e.g., 
polystyrene), polysaccharides, nylon or nitrocellulose, [[teflon™]] TEFLON® (synthetic resinous
fluorine-containing polymers), etc. Microtiter plates and arrays are especially convenient
because a large number of assays can be carried out simultaneously, using small amounts of 
reagents and samples. The particular manner of binding of the composition is not crucial so long
as it is compatible with the reagents and overall methods of the invention, maintains the activity
of the composition and is nondiffusable. Preferred methods of binding include the use of 
antibodies (which do not sterically block either the ligand binding site or activation sequence 
when the protein is bound to the support), direct binding to "sticky" or ionic supports, chemical 
crosslinking, the synthesis of the protein or agent on the surface, etc. Following binding of the 
protein or agent, excess unbound material is removed by washing. The sample receiving areas 
may then be blocked through incubation with bovine serum albumin (BSA), casein or other 
innocuous protein or other moiety. 

Amend page 180, line 2 as follows: 
TABLE [[1-]]20A 

Amend page 180, line 4 as follows: 
Table [[1-]]20A 

Amend page 205, line 2 as follows: 
TABLE [[1-]]20B 

Amend page 205, line 4 as follows: 
Table [[1-]]20B 